EXAMINER'S AMENDMENT

1. An examiner's amendment to the record appears below. Should the changes and/or
additions be unacceptable to applicant, an amendment may be filed as provided by 37 CFR 
1.312. To ensure consideration of such an amendment, it MUST be submitted no later than the 
payment of the issue fee. 

2. Authorization for this examiner's amendment was given in a telephone interview with 
Mr. Edward L. Pencoske on 03/27/2008. 

3. The following claims have been amended: 

1. (cancelled).

2. A method for balancing the work load of an n-dimensional array of processing elements (PEs),
wherein each dimension of said array includes said processing elements arranged in a plurality of 
lines and wherein each of said processing elements has a local number of tasks associated 
therewith, the method comprising: 

balancing a work load across at least one line of processing elements in a first dimension 
by redistributing the tasks amongst the processing elements in said line; 

balancing a work load across at least one line of processing elements in a next dimension 
by redistributing the tasks amongst the processing elements in said line; and

repeating said balancing at least one line of processing elements in a next dimension by
redistributing the task among the processing elements in said line for each dimension of said n- 
dimensional array until the work load is balanced across all said processing elements; and 
wherein said balancing a work load comprises: 

calculating a total number of tasks for said line, wherein said total number of tasks for 
said line equals the sum of said local number of tasks for each of said processing elements on 
said line; 

notifying each of said processing elements on said line of said total number of tasks for 
said line; 

calculating a local mean number of tasks for each of said processing elements on said
line;

calculating a local deviation from local mean number for each of said processing 
elements on said line; 

determining a first local cumulative deviation for each of said processing elements on 
said line; 

determining a second local cumulative deviation for each of said processing elements on 
said line; and 

redistributing tasks among said processing elements on said line in response to at least 
one of said first local cumulative deviation and said second local cumulative deviation. 
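
The dimension-by-dimension procedure of claim 2 may be illustrated, for a two-dimensional array, by the following minimal Python sketch. It is not part of the claim language. The function names `balanced_counts` and `balance_2d` are hypothetical, and the sketch assumes that each line simply ends in its balanced state (the local means of claim 7, taking E_r = r) rather than modelling individual task transfers.

```python
def balanced_counts(counts):
    """Task counts for one line after balancing (assumption: E_r = r)."""
    total, n = sum(counts), len(counts)
    return [(total + r) // n for r in range(n)]


def balance_2d(grid):
    """Balance every line in the first dimension, then every line in the next."""
    rows = [balanced_counts(row) for row in grid]         # first dimension
    cols = [balanced_counts(col) for col in zip(*rows)]   # next dimension
    return [list(row) for row in zip(*cols)]              # back to row-major


print(balance_2d([[4, 0], [0, 0]]))  # [[1, 1], [1, 1]]
```

The lines within one dimension are independent of each other in this sketch, which is consistent with claim 3's recitation that two or more lines may be balanced in parallel.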

3. The method of claim 2 wherein two or more lines in at least one of said first dimension and 
said next dimension are balanced in parallel.

4. The method of claim 2 wherein said calculating a total number of tasks for said line comprises
sequentially summing said local number of tasks for each of said processing elements on said 
line from a first end of said line to a second end of said line. 

5. The method of claim 2 wherein said calculating said total number of tasks for said line
includes solving the equation $V = \sum_{i=0}^{N-1} v_i$, where V represents said total number of
tasks for said line, N represents the number of processing elements on said line, and $v_i$
represents said local number of tasks for a local $PE_i$ on said line.

6. The method of claim 2 wherein said notifying step includes passing said total number of tasks 
from a second end of said line to a first end of said line. 
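
Claims 4 through 6 may be pictured with the sketch below, which is illustrative only. A running sum is carried from the first end of the line to the second end, and the resulting total V is then passed back toward the first end so that every PE holds it. The name `line_total` and the list-based model of a line are assumptions.

```python
def line_total(local_tasks):
    """Return, for each PE on a line, the total V it ends up holding."""
    running = 0
    for v in local_tasks:                         # first end -> second end
        running += v
    total = running                               # V = sum of v_i, i = 0 .. N-1
    received = [None] * len(local_tasks)
    for r in reversed(range(len(local_tasks))):   # second end -> first end
        received[r] = total
    return received


print(line_total([5, 1, 3, 1]))  # [10, 10, 10, 10]
```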

7. The method of claim 2 wherein said calculating a local mean number of tasks includes solving
the equation $M_r = \mathrm{Trunc}((V + E_r)/N)$, where $M_r$ represents said local mean for a local
processing element $PE_r$ on said line, N represents the total number of PEs on said line, V is the
total number of tasks, and $E_r$ is a number in the range of 0 to (N-1).

8. The method of claim 7 wherein each processing element has a different $E_r$ value.

9. The method of claim 7 wherein said Trunc function is responsive to $E_r$ such that said total
number of tasks for said line is equal to the sum of the local mean number of tasks for each 
processing element on said line. 

10. The method of claim 7 wherein said local mean $M_r = \mathrm{Trunc}((V + E_r)/N)$ for each local $PE_r$ on
said line is equal to either X or (X+1), where X is equal to local mean.
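
The behaviour recited in claims 7 through 10 can be checked numerically. The sketch below is illustrative only and assumes E_r = r for the PE at position r, which gives each PE a different value in the range 0 to (N-1); the claims do not fix this particular assignment. For non-negative V, Python's integer division `//` gives the same result as truncation.

```python
def local_means(total_tasks, n_pes):
    """M_r = Trunc((V + E_r) / N) for every PE on a line, assuming E_r = r."""
    return [(total_tasks + r) // n_pes for r in range(n_pes)]


means = local_means(10, 4)
print(means)       # [2, 2, 3, 3]
print(sum(means))  # 10
```

With V = 10 and N = 4 every local mean is either X = 2 or X + 1 = 3, and the local means sum back to V.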

11. The method of claim 2 wherein said calculating a local deviation for each processing element
on said line includes finding a difference between said local number of tasks for each $PE_r$ and
said local mean number of tasks for each $PE_r$.

12. The method of claim 2 wherein said determining a first local cumulative deviation includes 
sequentially summing said local deviations for each $PE_r$ from a first end of said line to an
adjacent upstream $PE_{r-1}$ on said line.

13. The method of claim 2 wherein said determining a second local cumulative deviation 
includes finding a difference between the negative of said local deviation for each $PE_r$ and said
first local cumulative deviation for each $PE_r$.
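
A short worked example may help in reading claims 11 through 13 together. The sketch below is illustrative only: the function name `cumulative_deviations` is hypothetical, and the local means are supplied by the caller (here the values [2, 2, 3, 3] obtained for V = 10, N = 4 under the assumption E_r = r).

```python
def cumulative_deviations(local_tasks, local_means):
    """Return (local deviation, first cumulative, second cumulative) per PE."""
    # Claim 11: local deviation = local number of tasks - local mean.
    deviation = [v - m for v, m in zip(local_tasks, local_means)]
    # Claim 12: sum of the deviations of all upstream PEs (PE_0 .. PE_{r-1}).
    first, running = [], 0
    for d in deviation:
        first.append(running)
        running += d
    # Claim 13: negative of the local deviation minus the first cumulative deviation.
    second = [-d - f for d, f in zip(deviation, first)]
    return deviation, first, second


dev, first, second = cumulative_deviations([5, 1, 3, 1], [2, 2, 3, 3])
print(dev)     # [3, -1, 0, -2]
print(first)   # [0, 3, 2, 2]
print(second)  # [-3, -2, -2, 0]
```

Because the local means sum to V, the deviations on a line sum to zero, so the second local cumulative deviation computed this way equals the sum of the deviations of the PEs downstream of $PE_r$.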

14. The method of claim 2 wherein said redistributing tasks among said processing elements on 
said line comprises:

transferring a task from a local $PE_r$ to a left-adjacent $PE_{r-1}$ if said first local cumulative
deviation for said local $PE_r$ is a negative value;

transferring a task from said local $PE_r$ to a right-adjacent $PE_{r+1}$ if said second local
cumulative deviation for said local $PE_r$ is a negative value.

15. The method of claim 2 wherein said redistributing tasks among said processing elements on 
said line comprises: 

transferring a task from a local $PE_r$ to a left-adjacent $PE_{r-1}$ if said second local
cumulative deviation for said local $PE_r$ is a positive value;

transferring a task from said local $PE_r$ to a right-adjacent $PE_{r+1}$ if said first local
cumulative deviation for said local $PE_r$ is a positive value.

16. The method of claim 2 wherein said calculating a local mean number of tasks; said 
calculating a local deviation; said determining a first local cumulative deviation; said 
determining a second local cumulative deviation; and said redistributing tasks are completed in 
parallel for each processing element on said line. 

17. The method of claim 16 wherein said calculating a local mean number of tasks; said 
calculating a local deviation; said determining a first local cumulative deviation; said 
determining a second local cumulative deviation; and said redistributing tasks are completed in 
parallel for each line in a selected dimension.

19. The method of claim 2 wherein said calculating a local deviation, said determining a first
local cumulative deviation, said determining a second local cumulative deviation, and said 
redistributing tasks among said processing elements are repeated until said local deviation, said 
first local cumulative deviation, and said second local cumulative deviation for each of said 
processing elements is zero. 
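
The repeated redistribution of claim 19 can be sketched as a loop over a single line. The sketch is illustrative only and rests on several assumptions that the claims do not state: E_r = r, at most one task crosses each neighbour boundary per pass, a PE sends only when it currently holds a task, and the transfer directions follow the rule of claim 14. The name `balance_line` is hypothetical.

```python
def balance_line(local_tasks):
    """Return (balanced task counts, number of passes) for one line of PEs."""
    tasks = list(local_tasks)
    n = len(tasks)
    total = sum(tasks)
    means = [(total + r) // n for r in range(n)]   # assumption: E_r = r
    passes = 0
    while True:
        dev = [v - m for v, m in zip(tasks, means)]
        if not any(dev):
            # All local deviations are zero, so both cumulative deviations are too.
            return tasks, passes
        first, running = [], 0
        for d in dev:
            first.append(running)
            running += d
        second = [-d - f for d, f in zip(dev, first)]
        moves = [0] * n
        for r in range(n):
            available = tasks[r]
            if r > 0 and first[r] < 0 and available > 0:
                moves[r] -= 1          # send one task to the left neighbour
                moves[r - 1] += 1
                available -= 1
            if r < n - 1 and second[r] < 0 and available > 0:
                moves[r] -= 1          # send one task to the right neighbour
                moves[r + 1] += 1
        tasks = [v + m for v, m in zip(tasks, moves)]
        passes += 1


print(balance_line([5, 1, 3, 1]))  # ([2, 2, 3, 3], 3)
```

All transfers in a pass are decided from the same snapshot of the line, which loosely mirrors the parallel operation recited in claim 16.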

20. A method for balancing a work load across one dimension of an n-dimensional array of 
processing elements (PEs), wherein each of said n-dimensions is traversed by a plurality of lines
and wherein each of said lines has a plurality of processing elements with a local number of tasks 
associated therewith, the method comprising: 

balancing said plurality of lines in one dimension by redistributing tasks amongst the 
processing elements in each of said plurality of lines; 

balancing said plurality of lines in a next higher dimension; 

repeating said balancing said plurality of lines in a next higher dimension for each 
remaining dimension of said n-dimensional array, wherein each of said balanced lines includes 
PEs with either a number of local tasks equal to X or a number of local tasks equal to (X+1),
where X equals a local mean; 

substituting the value zero (0) for each processing element having X local number of
tasks;

substituting the value one (1) for each processing element having (X+1) local number of
tasks; and

shifting said values for each processing element within said balanced lines until a sum of 
said processing elements relative to a second dimension has only two different values, wherein 
shifting said values represent moving a task. 
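
The substitution and shifting steps of claim 20 may be pictured as follows. The sketch is illustrative only. The particular shifting rule used here (placing each line's ones cyclically, starting where the previous line stopped) is an assumption chosen for illustration and is not stated in the claims; the names `to_bits` and `stagger_ones` are hypothetical.

```python
def to_bits(rows):
    """Substitute 0 for PEs holding X tasks and 1 for PEs holding X + 1."""
    return [[v - min(row) for v in row] for row in rows]


def stagger_ones(bits):
    """Shift the ones within each line so the sums along the other dimension even out."""
    n_cols = len(bits[0])
    start = 0
    shifted = []
    for row in bits:
        ones = sum(row)                      # shifting keeps each line's total
        new_row = [0] * n_cols
        for j in range(ones):
            new_row[(start + j) % n_cols] = 1
        start = (start + ones) % n_cols      # next line continues from here
        shifted.append(new_row)
    return shifted


rows = [[2, 3, 3], [1, 1, 2], [4, 5, 5]]     # each line already balanced
bits = to_bits(rows)
shifted = stagger_ones(bits)
print(bits)                                  # [[0, 1, 1], [0, 0, 1], [0, 1, 1]]
print(shifted)                               # [[1, 1, 0], [0, 0, 1], [1, 1, 0]]
print([sum(col) for col in zip(*shifted)])   # [2, 2, 1]
```

Before shifting, the sums along the second dimension are 0, 2 and 3; afterwards they take only the two values 1 and 2, while every line keeps the same number of ones.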

21. (cancelled). 

22. The method of claim 20 wherein said balancing said plurality of lines in one dimension 
comprises: 

calculating a total number of tasks present within at least one of said lines;

notifying each processing element on said line of said total number of tasks for said line;

determining each processing element's share of said total number of tasks on said line;

calculating a local deviation from said previous steps;

determining a first local cumulative deviation for each processing element on said line 
using said local deviation; 

determining a second local cumulative deviation for each processing element on said line 
using said local deviation; 

redistributing tasks among each processing element on said line in response to at least 
one of said first local cumulative deviation and said second local cumulative deviation.

23. The method of claim 22 wherein said notifying each processing element comprises: serially
summing said total number of tasks present on said line; and transmitting said total number of 
tasks to each processing element on said line. 

24. The method of claim 22 wherein said determining each processing element's share of said 
total number of tasks comprises: 

calculating a local mean number of tasks for each processing element on said line; and

calculating a local deviation from said local mean number of tasks for each processing
element on said line by finding the difference between said local number of tasks and said local
mean number of tasks for each processing element on said line.

25. The method of claim 24 wherein said calculating a local mean number of tasks for each
processing element on said line comprises using a rounding function $M_r = \mathrm{Trunc}((V + E_r)/N)$,
where $M_r$ represents said local mean of a local processing element $PE_r$, N represents the total
number of processing elements on said line, V is the total number of tasks, and $E_r$ represents a
number in the range of 0 to (N-1).

26. The method of claim 25 wherein said Trunc function is responsive to $E_r$ such that said total
number of tasks for said line is equal to the sum of the local mean number of tasks for each of 
said processing elements in said line.

27. The method of claim 25 wherein said local mean $M_r = \mathrm{Trunc}((V + E_r)/N)$ for each local
processing element on said line is equal to either X or (X+1), where X is equal to a local mean.

28. The method of claim 22 wherein said determining a first local cumulative deviation for each 
processing element on said line includes summing said local deviations for each upstream 
processing element on said line. 

29. The method of claim 22 wherein said determining a second local cumulative deviation for 
each processing element on said line includes finding the difference between the negative of said 
local deviation and said first local cumulative deviation for each processing element on said line. 

30. The method of claim 22 wherein said redistributing tasks among each processing element on 
said line in response to at least one of said first local cumulative deviation and said second local 
cumulative deviation comprises: 

transferring a task from a first processing element on said line to a second processing 
element on said line if said first local cumulative deviation for said first processing element is a 
negative value; and 

transferring a task from said second processing element on said line to said first 
processing element on said line if said first local cumulative deviation for said second processing 
element is a positive value.

31. The method of claim 22 wherein said redistributing tasks among each processing element on
said line in response to at least one of said first local cumulative deviation and said second local 
cumulative deviation comprises: 

transferring a task to a first processing element on said line from a second processing 
element on said line if said second local cumulative deviation for said first processing element is 
a negative value; and 

transferring a task to said second processing element on said line from said first 
processing element on said line if said second local cumulative deviation for said second 
processing element is a positive value. 

32. The method of claim 24 wherein said calculating a local deviation, said determining a first 
local cumulative deviation, said determining a second local cumulative deviation, and said 
redistributing tasks among said processing elements are repeated until said local deviation, said 
first local cumulative deviation, and said second local cumulative deviation for each of said 
processing elements is zero. 

33. (cancelled). 

34. A computer memory storing a set of instructions which, when executed, perform a method for
balancing a work load across one dimension of an n-dimensional array of processing
elements (PEs), wherein each of said n-dimensions is traversed by a plurality of lines and where
each of said lines has a plurality of processing elements with a local number of tasks associated
therewith, the method comprising: 

balancing said plurality of lines in one dimension by redistributing tasks amongst the 
processing elements in each of said plurality of lines; 

balancing said plurality of lines in a next higher dimension; 

repeating said balancing said plurality of lines in a next higher dimension for each 
remaining dimension of said n-dimensional array, wherein each of said balanced lines includes 
PEs with either a number of local tasks equal to X or a number of local tasks equal to (X+1),
where X equals a local mean;

substituting the value zero (0) for each processing element having X local number of
tasks;

substituting the value one (1) for each processing element having (X+1) local number of
tasks; and 

shifting said values for each processing element within said balanced lines until a sum of 
said processing elements relative to a second dimension has only two different values, wherein 
shifting said values represent moving a task. 

Conclusion 

4. Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to ABDULLAH AL KAWSAR whose telephone number is 
(571)270-3169. The examiner can normally be reached on 7:30am to 5:00pm, EST.

5. If attempts to reach the examiner by telephone are unsuccessful, the examiner's
supervisor, Meng Ai T. An can be reached on 571-272-3756. The fax phone number for the 
organization where this application or proceeding is assigned is 571-273-8300. 

6. Information regarding the status of an application may be obtained from the Patent 
Application Information Retrieval (PAIR) system. Status information for published applications 
may be obtained from either Private PAIR or Public PAIR. Status information for unpublished 
applications is available through Private PAIR only. For more information about the PAIR 
system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR 
system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would 
like assistance from a USPTO Customer Service Representative or access to the automated 
information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. 

Abdullah-Al Kawsar
Patent Examiner 
ART Unit 2195. 



/Thomas Lee/ 

Supervisory Patent Examiner, Art Unit 2115 



